NP coordination in underspecified scope representations
Accurately capturing the quantifier scope behaviour of coordinated NPs can be problematic for underspecification systems that define constraints over semantic constructors. We present an extension to a hole-semantics-like language that allows a natural representation of coordinated NPs, and a translation from partial scope requirements into constraints on the constructors. We conclude that the efficient decision procedures developed for constraints on semantic constructors enable the possible meanings of sentences containing coordinated NPs to be fully underspecified.
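The core idea of underspecified scope, enumerating only those quantifier orderings that satisfy stated constraints, can be illustrated with a toy sketch. This is not the authors' formalism; the sentence, quantifier names, and the adjacency constraint standing in for the coordination requirement are all illustrative assumptions.

```python
from itertools import permutations

# Toy illustration (not the authors' formalism): enumerate scope readings
# by permuting quantifiers, keeping only orders that satisfy a simple
# stand-in constraint on the coordinated NPs.
quantifiers = ["every_student", "some_teacher", "a_book"]

# Hypothetical constraint: the two coordinated NPs take scope as a unit,
# so "every_student" and "some_teacher" must be adjacent in the ordering.
def satisfies_constraints(order):
    i = order.index("every_student")
    j = order.index("some_teacher")
    return abs(i - j) == 1

readings = [order for order in permutations(quantifiers)
            if satisfies_constraints(order)]
for r in readings:
    print(" > ".join(r))  # outermost > ... > innermost scope
```

The underspecified representation stands for all surviving orderings at once; a decision procedure over the constraints avoids enumerating them until needed.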
Rethinking the Agreement in Human Evaluation Tasks
Human evaluations are broadly thought to be more valuable the higher the inter-annotator agreement. In this paper we examine this idea. We describe our experiments and analysis within the area of Automatic Question Generation. Our experiments show how annotators diverge in language annotation tasks due to a range of ineliminable factors. For this reason, we believe that annotation schemes for natural language generation tasks aimed at evaluating language quality need to be treated with great care. In particular, an unchecked focus on reducing disagreement among annotators runs the danger of creating generation goals that reward output that is more distant from, rather than closer to, natural human-like language. We conclude the paper by suggesting a new approach to the use of agreement metrics in natural language generation evaluation tasks.
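The agreement statistic at issue can be made concrete with Cohen's kappa, a standard chance-corrected measure for two annotators. This sketch is illustrative only, not taken from the paper; the labels are made up.

```python
from collections import Counter

# Illustrative sketch: Cohen's kappa for two annotators labelling the same
# items, i.e. the kind of agreement statistic the paper cautions against
# optimising blindly.
def cohens_kappa(labels_a, labels_b):
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label distribution.
    ca, cb = Counter(labels_a), Counter(labels_b)
    expected = sum(ca[l] * cb[l] for l in set(labels_a) | set(labels_b)) / n**2
    return (observed - expected) / (1 - expected)

a = ["good", "good", "bad", "good", "bad", "good"]  # hypothetical ratings
b = ["good", "bad", "bad", "good", "bad", "bad"]
print(round(cohens_kappa(a, b), 3))  # prints 0.4
```

Note that two annotators can disagree for principled reasons (the "ineliminable factors" above), so a low kappa is not by itself evidence of a bad annotation scheme.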
Mapping networks of influence: tracking Twitter conversations through time and space
The increasing use of social media around global news events, such as the London Olympics in 2012, raises questions for international broadcasters about how to engage with users via social media in order to best achieve their individual missions. Twitter is a highly diverse social network whose conversations are multi-directional, involving individual users, political and cultural actors, athletes and a range of media professionals. Through these interactions, users form networks of influence that affect the ways information is shared about specific global events.
This article attempts to understand how networks of influence are formed among Twitter users, and the relative influence of global news media organisations and information providers in the Twittersphere during such global news events. We build an analysis around a set of tweets collected during the 2012 London Olympics. To understand how different users influence the conversations across Twitter, we compare three types of accounts: those belonging to a number of well-known athletes, those belonging to some well-known commentators employed by the BBC, and a number of corporate accounts belonging to the BBC World Service and the official London Twitter account. We look at the data from two perspectives. First, to understand the structure of the social groupings formed among Twitter users, we use a network analysis to model social groupings in the Twittersphere across time and space. Second, to assess the influence of individual tweets, we investigate the ageing factor of tweets, which measures how long users continue to interact with a particular tweet after it is originally posted.
We consider what the profile of particular tweets from corporate and athletes' accounts can tell us about how networks of influence are forged and maintained. We use these analyses to answer two questions: how do different types of accounts help shape the social networks, and what determines the level and type of influence of a particular account?
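The "ageing factor" described above, how long users continue to interact with a tweet after it is posted, can be sketched as follows. The field names and the choice of last-interaction age as the measure are assumptions for illustration, not the article's exact definition.

```python
from datetime import datetime, timedelta

# Hypothetical sketch of the "ageing factor" idea: for each tweet, measure
# how long interactions (retweets, replies) continue after the original
# posting time. Data values are illustrative.
def ageing(posted_at, interaction_times):
    """Return the age of the last interaction relative to posting."""
    if not interaction_times:
        return timedelta(0)
    return max(interaction_times) - posted_at

posted = datetime(2012, 8, 5, 20, 0)
interactions = [posted + timedelta(minutes=m) for m in (2, 15, 90, 480)]
print(ageing(posted, interactions))  # span over which the tweet stayed "alive"
```

A tweet with a long ageing span keeps circulating well after posting, which is one way to distinguish durable influence from a short-lived burst.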
Evaluation methodologies in Automatic Question Generation 2013-2018
In the last few years Automatic Question Generation (AQG) has attracted increasing interest. In this paper we survey the evaluation methodologies used in AQG. Based on a sample of 37 papers, our research shows that the systems' development has not been accompanied by similar developments in the methodologies used for the systems' evaluation. Indeed, in the papers we examine here, we find a wide variety of both intrinsic and extrinsic evaluation methodologies. Such diverse evaluation practices make it difficult to reliably compare the quality of different generation systems. Our study suggests that, given the rapidly increasing level of research in the area, a common framework is urgently needed to compare the performance of AQG systems and NLG systems more generally.
Adverse Drug Reaction Classification With Deep Neural Networks
We study the problem of detecting sentences describing adverse drug reactions (ADRs) and frame the problem as binary classification. We investigate different neural network (NN) architectures for ADR classification. In particular, we propose two new neural network models: the Convolutional Recurrent Neural Network (CRNN), formed by concatenating convolutional neural networks with recurrent neural networks, and the Convolutional Neural Network with Attention (CNNA), formed by adding attention weights to convolutional neural networks. We evaluate various NN architectures on a Twitter dataset containing informal language and an Adverse Drug Effects (ADE) dataset constructed by sampling from MEDLINE case reports. Experimental results show that all the NN architectures considerably outperform traditional maximum entropy classifiers trained on n-grams with different weighting strategies on both datasets. On the Twitter dataset, all the NN architectures perform similarly, but on the ADE dataset, CNN performs better than the more complex CNN variants. Nevertheless, CNNA allows the visualisation of attention weights of words when making classification decisions and hence is more appropriate for the extraction of word subsequences describing ADRs.
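The attention mechanism that makes CNNA interpretable can be sketched in miniature: each word receives a score, a softmax turns the scores into weights, and the weights both pool the word features and serve as the visualisation. This is a stand-alone illustration, not the paper's model; in CNNA the scores come from convolutional features, whereas here they are made up.

```python
import math

# Minimal sketch of the attention idea behind CNNA (not the paper's exact
# model): softmax over per-word relevance scores yields weights used both
# to pool word features and to highlight ADR-describing words.
def softmax(scores):
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

words = ["patient", "developed", "severe", "nausea"]
scores = [0.1, 0.3, 1.5, 2.0]            # hypothetical relevance scores
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.5]]

weights = softmax(scores)
sentence_vec = [sum(w * f[d] for w, f in zip(weights, features))
                for d in range(2)]       # attention-weighted sentence vector
top_word = words[weights.index(max(weights))]
print(top_word, [round(w, 2) for w in weights])
```

The weight vector is what gets visualised: here the highest weight falls on "nausea", the word most indicative of an ADR in this toy sentence.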
Using discovered, polyphonic patterns to filter computer-generated music
A metric for evaluating the creativity of a music-generating system is presented, the objective being to generate mazurka-style music that inherits salient patterns from an original excerpt by Frédéric Chopin. The metric acts as a filter within our overall system, causing rejection of generated passages that do not inherit salient patterns, until a generated passage survives. Over fifty iterations, the mean number of generations required until survival was 12.7, with standard deviation 13.2. In the interests of clarity and replicability, the system is described with reference to specific excerpts of music. Four concepts (Markov modelling for generation, pattern discovery, pattern quantification, and statistical testing) are presented quite distinctly, so that the reader might adopt (or ignore) each concept as they wish.
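The generate-and-filter loop described above can be sketched as rejection sampling: generate a passage, test it against the pattern filter, and repeat until one survives. Both the generator and the filter below are simplistic stand-ins for the paper's Markov model and pattern-inheritance metric.

```python
import random

# Toy sketch of the generate-and-test loop: keep generating passages until
# one passes the pattern filter. Generator and filter are stand-ins for the
# paper's Markov model and salient-pattern metric.
random.seed(0)

def generate_passage():
    return [random.choice("CDEFGAB") for _ in range(8)]  # stand-in generator

def inherits_salient_pattern(passage):
    # Stand-in filter: require the motif ["C", "E"] to appear contiguously.
    return any(passage[i:i + 2] == ["C", "E"] for i in range(len(passage) - 1))

attempts = 0
while True:
    attempts += 1
    passage = generate_passage()
    if inherits_salient_pattern(passage):
        break
print(attempts, passage)
```

The paper's reported mean of 12.7 generations until survival is exactly the `attempts` count of such a loop, averaged over fifty runs with the real generator and filter.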
A comparative evaluation of algorithms for discovering translational patterns in Baroque keyboard works
We consider the problem of intra-opus pattern discovery, that is, the task of discovering patterns of a specified type within a piece of music. A music analyst undertook this task for works by Domenico Scarlatti and Johann Sebastian Bach, forming a benchmark of 'target' patterns. The performance of two existing algorithms and one of our own creation, called SIACT, is evaluated by comparison with this benchmark. SIACT outperforms the existing algorithms with regard to recall and, more often than not, precision. It is demonstrated that in all but the most carefully selected excerpts of music, the two existing algorithms can be affected by what is termed the 'problem of isolated membership'. Central to the relative success of SIACT is our intention that it should address this particular problem. The paper contrasts string-based and geometric approaches to pattern discovery, with an introduction to the latter. Suggestions for future work are given.
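The benchmark comparison above reduces to computing precision and recall of discovered patterns against the analyst's targets. The sketch below assumes patterns are represented as tuples of (onset, pitch) pairs and matched exactly; both are simplifying assumptions, since real evaluations often allow partial or transposed matches.

```python
# Sketch of the evaluation described above: discovered patterns are compared
# against an analyst's benchmark of target patterns. Patterns are simplified
# to tuples of (onset, pitch) pairs; exact matching is an assumption.
def precision_recall(discovered, targets):
    found = set(discovered)
    relevant = set(targets)
    tp = len(found & relevant)              # true positives
    precision = tp / len(found) if found else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    return precision, recall

targets = [((0, 60), (1, 62)), ((4, 67), (5, 65)), ((8, 60), (9, 59))]
discovered = [((0, 60), (1, 62)), ((4, 67), (5, 65)), ((2, 70), (3, 71))]
p, r = precision_recall(discovered, targets)
print(p, r)
```

The 'problem of isolated membership' hurts recall in this framing: an algorithm that reports fragments of a target pattern, rather than the whole pattern, scores no true positive for it.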
ComTax: community-driven curation for taxonomic databases
This poster presents the work of the ComTax project to develop a community-driven curation process among practicing scientists and citizen scientists. The project provides tools to help scientists identify and validate appropriate taxonomic names from the scanned historical literature. The system operates on scanned documents, typically taken from the Biodiversity Heritage Library, although documents sourced from other repositories could be used.
The system is intended to be used on uncorrected text after optical character recognition (OCR) on the scanned images. The key stages are:
1. Identify possible taxonomic names in the scanned text using machine learning techniques.
2. Verify the extracted names against existing databases. If present, the source scanned text can be automatically marked-up with the name.
3. A name may fail verification because it is not currently recorded in the verification databases, typically because the old name in the literature has since been reclassified, or because OCR errors mean that the name is incorrectly transcribed in the scanned text. In either case:
3.1. Present the proposed name to domain experts or citizen scientists for validation or correction, potentially through a voting mechanism to collect expert judgments on the putative taxonomic name.
3.2. Mark-up the scanned text with the corrected spelling of the name and offer validated taxonomic names for further use by the community.
This poster will describe the technical challenges facing the ComTax project, and highlight potential extensions of the work to the curation of other entities of interest in the legacy literature or in other disciplines.
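The verification flow in steps 2-3 above can be sketched as a small decision function. The name database, the example names, and the majority-vote rule are all stand-ins for the project's real components.

```python
# Illustrative sketch of the pipeline in steps 2-3 above. The candidate
# names, verification database, and voting rule are all stand-ins.
KNOWN_NAMES = {"Quercus robur", "Panthera leo"}  # stand-in for the databases

def verify(candidate, votes=None):
    """Return (status, name): verified, corrected, or unresolved."""
    if candidate in KNOWN_NAMES:
        return "verified", candidate            # step 2: mark up source text
    if votes:                                    # step 3.1: expert judgments
        corrected = max(set(votes), key=votes.count)
        if corrected in KNOWN_NAMES:
            return "corrected", corrected        # step 3.2: fix OCR spelling
    return "unresolved", candidate

print(verify("Quercus robur"))
print(verify("Qvercus rohur", votes=["Quercus robur", "Quercus robur"]))
```

Names that come back "corrected" feed the mark-up in step 3.2 and are offered back to the community; "unresolved" names stay queued for further expert review.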